Data-driven pattern identification and outlier detection in time series
Abstract
We address the problem of data-driven pattern identification and outlier detection in time series. To this end, we use singular value decomposition (SVD), a well-known technique for computing a low-rank approximation of an arbitrary matrix. Recasting the time series as a matrix makes it possible to use SVD to highlight the underlying patterns and periodicities, without the need to specify user-defined parameters. From a data mining perspective, this opens up new ways of analyzing time series in a data-driven, bottom-up fashion. However, to get correct results, it is important to understand how the SVD spectrum of a time series is influenced by various characteristics of the underlying signal and noise. This paper extends earlier work by initiating a more systematic analysis of these effects. We then illustrate our findings on some real-life data.
Keywords: data mining; time series; outliers; singular value decomposition (SVD); parameter-free approximation.
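The abstract does not say how the time series is recast as a matrix, so the following is only a minimal sketch under stated assumptions, not the authors' procedure. It assumes a synthetic series with a hypothetical, already-known period of 24 samples, folds it into one row per cycle, and uses NumPy's `np.linalg.svd`. The helper names (`fold`, `rank1_energy`), the injected outlier, and the residual threshold of 6 robust standard deviations are all illustrative choices.

```python
import numpy as np

def fold(series, row_length):
    """Reshape a 1-D series into a matrix with `row_length` samples per row."""
    series = np.asarray(series, dtype=float)
    n_rows = series.size // row_length
    return series[: n_rows * row_length].reshape(n_rows, row_length)

def rank1_energy(matrix):
    """Fraction of the squared singular values carried by the first one."""
    s = np.linalg.svd(matrix, compute_uv=False)
    return s[0] ** 2 / np.sum(s ** 2)

# Synthetic data (assumption): noisy sinusoid with period 24 and one outlier.
rng = np.random.default_rng(0)
period = 24
t = np.arange(period * 30)
series = np.sin(2 * np.pi * t / period) + 0.05 * rng.standard_normal(t.size)
series[100] += 1.5  # injected outlier at sample 100

# When the row length matches the period, the rows are near-copies of one
# pattern, so the spectrum concentrates in the first singular value.
energy_matched = rank1_energy(fold(series, period))
energy_mismatched = rank1_energy(fold(series, period - 1))

# The rank-1 approximation captures the repeating pattern; what it fails to
# explain is a candidate outlier.
matrix = fold(series, period)
u, s, vt = np.linalg.svd(matrix, full_matrices=False)
rank1 = s[0] * np.outer(u[:, 0], vt[0])
residuals = matrix - rank1

# Robust scale from the median absolute deviation (illustrative threshold).
centered = residuals - np.median(residuals)
scale = 1.4826 * np.median(np.abs(centered))
flagged = np.flatnonzero(np.abs(centered).ravel() > 6 * scale)

print(f"rank-1 energy, row length {period}: {energy_matched:.3f}")
print(f"rank-1 energy, row length {period - 1}: {energy_mismatched:.3f}")
print("flagged sample indices:", flagged.tolist())
```

Because the matrix is filled row by row, a flat index into the residual matrix is also the sample index in the original series. A large outlier also perturbs the fitted coefficients for its own row and column, so neighbouring residuals are not pure noise. That is one concrete reason the characteristics of signal and noise matter when interpreting the spectrum.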
Similar resources
Identification of outliers types in multivariate time series using genetic algorithm
Multivariate time series data are often modeled using the vector autoregressive moving average (VARMA) model. However, the presence of outliers can violate the stationarity assumption and may lead to incorrect modeling, biased parameter estimates, and inaccurate predictions. Thus, detection of these points and how to deal properly with them, especially in relation to modeling and parameter estimation of VARMA m...
Research on Maximal Frequent Pattern Outlier Factor for Online High-Dimensional Time-Series Outlier Detection
The frequent pattern outlier factor is used to detect outliers with complete frequent itemsets. However, its low efficiency makes it difficult to apply to real-world time-series data streams. In this paper, we propose a novel maximal frequent pattern outlier factor (MFPOF) and an outlier detection algorithm (OODFP) for online high-dimensional time-series outlier detection. Firstly, the time-s...
Outlier Detection in Wireless Sensor Networks Using Distributed Principal Component Analysis
Detecting anomalies is an important challenge for intrusion detection and fault diagnosis in wireless sensor networks (WSNs). To address the problem of outlier detection in wireless sensor networks, in this paper we present a PCA-based centralized approach and a DPCA-based distributed energy-efficient approach for detecting outliers in sensed data in a WSN. The outliers in sensed data can be ca...
A statistical test for outlier identification in data envelopment analysis
In the use of peer group data to assess individual, typical or best practice performance, the effective detection of outliers is critical for achieving useful results. In these “deterministic” frontier models, statistical theory is now mostly available. This paper deals with the statistical pared sample method and its capability of detecting outliers in data envelopment analysis. In the prese...
ODD 2 Workshop on Outlier Detection & Description under Data Diversity
In this talk I will briefly discuss recent advances in outlier detection, with a focus on distance-based techniques, and discuss possible future directions in the context of rank-driven interactive analysis and data-guided explanations and visualizations. Time permitting, we will examine such techniques in the context of real-world analysis of multi-modal data including time series, graphs, text ...